Cross-language Transfer of Multilingual Phoneme Models
Abstract
We present a method that uses speech data from multiple languages to improve a flexible-vocabulary command word recognizer trained on only a small amount of target-language speech. We develop data-driven approaches for identifying multilingual phoneme units and for mapping these units to the target-language phonemes, and evaluate them against the knowledge-based approach of mapping identical SAMPA phoneme symbols. We also show that multilingual context-dependent phoneme modeling is useful for cross-language transfer. Our method significantly improves recognition performance in the target languages Danish and English through cross-language transfer of multilingual models trained on French, German, Italian, Portuguese and Spanish speech, provided that phonetically rich target-language speech is available from fewer than 100 speakers, at roughly half a minute per speaker.
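The knowledge-based baseline mentioned in the abstract, mapping identical SAMPA phoneme symbols, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the inventories are tiny hypothetical subsets rather than real phoneme sets, the function names are invented, and the code says nothing about how the authors actually implemented the mapping or their data-driven alternatives.

```python
from collections import defaultdict

# Hypothetical, heavily reduced SAMPA inventories (illustrative only,
# not the real phoneme sets of these languages).
SOURCE_INVENTORIES = {
    "French": {"a", "i", "u", "p", "t", "k", "R", "S"},
    "German": {"a", "i", "u", "p", "t", "k", "R", "S", "x"},
    "Spanish": {"a", "i", "u", "p", "t", "k", "r", "x"},
}


def build_multilingual_units(inventories):
    """Pool phonemes that share a SAMPA symbol into one multilingual unit.

    Returns a dict: SAMPA symbol -> set of source languages containing it.
    """
    units = defaultdict(set)
    for language, phonemes in inventories.items():
        for symbol in phonemes:
            units[symbol].add(language)
    return dict(units)


def map_target_phonemes(target_phonemes, units):
    """Map each target phoneme to the multilingual unit with the same symbol.

    Phonemes without an identical symbol in any source language are
    returned separately; they would need another mapping strategy.
    """
    mapped = {p: p for p in target_phonemes if p in units}
    unmapped = sorted(p for p in target_phonemes if p not in units)
    return mapped, unmapped


units = build_multilingual_units(SOURCE_INVENTORIES)
# Hypothetical target subset containing two symbols ("T", "D")
# that are absent from the source inventories above.
mapped, unmapped = map_target_phonemes({"a", "i", "t", "k", "T", "D"}, units)
```

In this toy setup the target symbols without a source counterpart end up in `unmapped`, which is the kind of gap that motivates a data-driven mapping instead of relying on symbol identity alone.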
Related resources
Integrating Thai grapheme based acoustic models into the ML-MIX framework - for language independent and cross-language ASR
Grapheme-based speech recognition is a powerful tool for rapidly creating automatic speech recognition (ASR) systems in new languages. For language-independent or cross-language speech recognition, it is necessary to identify similar models in the different languages involved. For phoneme-based multilingual ASR systems this is usually achieved with the help of a language independent ...
Cross-Linguistic Transfer or Target Language Proficiency: Writing Performance of Trilinguals vs. Bilinguals in Relation to the Interdependence Hypothesis
This study explored the nature of transfer among bilinguals vs. trilinguals with varying levels of competence in English and their previous languages. The hypotheses were tested in writing tasks designed for 75 EFL learners of high (N=35) vs. intermediate (N=40) proficiency with Turkish-Persian-English and Persian-English linguistic backgrounds. Qualitative data were also collected through some ...
Adaptation of Pronunciation Dictionaries for Recognition of Unseen Languages
This paper studies the relative effectiveness of different methods for multilingual model combination and dictionary mapping for recognizing a new, unseen target language when training data are limited. We examine the cross-language transfer from monolingual and multilingual models to German and Russian for large vocabulary speech recognition using a dictation database which has been colle...
Development of Multilingual Acoustic Models in the GlobalPhone Project
This paper describes our recent effort in developing the GlobalPhone recognizer for multilingual large vocabulary continuous speech. Turkish. Based on five languages we developed a global phoneme set and built a multilingual speech recognizer by varying the method of acoustic model combination. Context-dependent phoneme models are created using questions about languages and language groups. Results...
Language independent and language adaptive large vocabulary speech recognition
This paper describes the design of a multilingual speech recognizer using an LVCSR dictation database which has been collected under the project GlobalPhone. This project at the University of Karlsruhe investigates LVCSR systems in 15 languages of the world, namely Arabic, Chinese, Croatian, English, French, German, Italian, Japanese, Korean, Portuguese, Russian, Spanish, Swedish, Tamil, and Tu...